Considering Autocorrelation in Predictive Models
Author
Abstract
Most machine learning, data mining and statistical methods rely on the assumption that the analyzed data points are independent and identically distributed (i.i.d.). More specifically, the individual examples in the training data are assumed to be drawn independently of one another from the same probability distribution. However, cases where this assumption is violated are easy to find: for example, species are distributed non-randomly across a wide range of spatial scales. The i.i.d. assumption is often violated because of autocorrelation. In its most general definition in the literature, autocorrelation is the cross-correlation of an attribute with itself. In spatial analysis specifically, spatial autocorrelation has been defined as the correlation among data values that is strictly due to the relative location proximity of the objects that the data refer to. It is justified by Tobler’s first law of geography [1], according to which “everything is related to everything else, but near things are more related than distant things”. In network studies, autocorrelation is defined by the homophily principle [2] as the tendency of nodes with similar values to be linked with each other.
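As a rough illustration of the spatial definition above, the sketch below computes Moran's I, a standard global measure of spatial autocorrelation. The abstract does not name this statistic, so its use here is an assumption made for illustration only. The function names (`morans_i`, `chain_weights`), the binary neighbor weights on a one-dimensional chain of sites, and the sample values are all hypothetical.

```python
from typing import List, Sequence


def morans_i(values: Sequence[float], weights: Sequence[Sequence[float]]) -> float:
    """Global Moran's I for `values` under a spatial weight matrix `weights`.

    weights[i][j] > 0 means sites i and j are treated as neighbors.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]  # deviations from the mean
    total_w = sum(sum(row) for row in weights)
    # Cross-products of deviations, counted only for neighboring sites.
    num = sum(
        weights[i][j] * dev[i] * dev[j]
        for i in range(n)
        for j in range(n)
    )
    den = sum(d * d for d in dev)
    return (n / total_w) * (num / den)


def chain_weights(n: int) -> List[List[float]]:
    """Hypothetical layout: n sites on a line, each adjacent to i-1 and i+1."""
    return [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n)] for i in range(n)]


w = chain_weights(6)
clustered = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]    # near sites share similar values
alternating = [1.0, 5.0, 1.0, 5.0, 1.0, 5.0]  # near sites differ

print(morans_i(clustered, w))    # positive: similar values sit next to each other
print(morans_i(alternating, w))  # negative: neighbors are dissimilar
```

A value near zero would suggest no spatial pattern. A clearly positive value, as for the clustered series, is the situation in which the i.i.d. assumption is most plainly violated.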
Similar resources
Control chart based on residues: Is a good methodology to detect outliers?
The purpose of this article is to evaluate the application of forecasting models along with the use of residual control charts to assess production processes whose samples have autocorrelation characteristics. The main objective is to determine the efficiency of control charts for individual observations (CCIO) and exponentially weighted moving average (EWMA) charts when they are applied to res...
TESTING FOR AUTOCORRELATION IN UNEQUALLY REPLICATED FUNCTIONAL MEASUREMENT ERROR MODELS
In ordinary linear models, regressing the residuals against lagged values has been suggested as an approach to test the hypothesis of zero autocorrelation among residuals. In this paper we extend these results to both equally and unequally replicated functional measurement error models. We consider the equally and unequally replicated cases separately, because in the first case the re...
Fast Autocorrelated Context Models for Data Compression
A method is presented to automatically generate context models of data by calculating the data’s autocorrelation function. The largest values of the autocorrelation function occur at the offsets or lags in the bitstream which tend to be the most highly correlated to any particular location. These offsets are ideal for use in predictive coding, such as predictive partial match (PPM) or context-m...
GIS-Based Analytical Tools for Transport Planning: Spatial Regression Models for Transportation Demand Forecast
Considering the importance of spatial issues in transport planning, the main objective of this study was to analyze the results obtained from different approaches of spatial regression models. In the case of spatial autocorrelation, spatial dependence patterns should be incorporated in the models, since that dependence may affect the predictive power of these models. The results obtained with t...
Global and Local Spatial Autocorrelation in Predictive Clustering Trees
Spatial autocorrelation is the correlation among data values, strictly due to the relative location proximity of the objects that the data refer to. This statistical property clearly indicates a violation of the assumption of observation independence, a pre-condition assumed by most of the data mining and statistical models. Inappropriate treatment of data with spatial dependencies could obfusca...
Journal title:
- Informatica (Slovenia)
Volume 37, Issue
Pages -
Publication date 2013